A New Similarity Measure with Length Factor for Plagiarism Detection
Abstract
Several similarity measures are available for comparing textual data, and they are commonly used for plagiarism detection. This paper proposes a new similarity measure and, in addition, proposes taking the length of the content into account when determining the plagiarism score.
General Terms: Data mining, plagiarism detection.
Related resources
The Encoplot Similarity Measure for Automatic Detection of Plagiarism - Notebook for PAN at CLEF 2011
This paper describes the evolution of our Encoplot method for automatic plagiarism detection and the results of our participation in the PAN’11 competition. The main novelties are a new similarity measure and a new ranking method, which together rank the source–suspicious document pairs much better when selecting candidates for the detailed analysis phase. We hav...
English-Persian Plagiarism Detection based on a Semantic Approach
Plagiarism, defined as “the wrongful appropriation of other writers’ or authors’ works and ideas without citing or informing them”, poses a major challenge to the dissemination and publication of knowledge. It has been divided into four categories: direct, paraphrasing (rewriting), translation, and combinatory. This paper addresses translational plagiarism, which is sometimes referred to as cross-li...
Adaptive Algorithm for Plagiarism Detection: The Best-Performing Approach at PAN 2014 Text Alignment Competition
The task of (monolingual) text alignment consists in finding similar text fragments between two given documents. It has applications in plagiarism detection, detection of text reuse, author identification, authoring aid, and information retrieval, to mention only a few. We describe our approach to the text alignment subtask of the plagiarism detection competition at PAN 2014, which resulted in ...
Designing a Sentence-Based Plagiarism Detection System for Persian Texts Using Evidence Fusion
Today, there are many documents on the Internet, so users can generate new documents by copying them, and existing plagiarism detection systems (PDS) cannot detect all kinds of plagiarism. The main challenge is finding a suitable algorithm that improves both the number of similar documents found and the time needed to assess them. It is difficult to assess similarity in Persian texts that different characteri...
A Winning Approach to Text Alignment for Text Reuse Detection at PAN 2014
The task of (monolingual) text alignment consists in finding similar text fragments between two given documents. It has applications in plagiarism detection, detection of text reuse, author identification, authoring aid, and information retrieval, to mention only a few. We describe our approach to the text alignment subtask of the PAN 2014 plagiarism detection competition. Our method relies on a se...